Search for: All records

Creators/Authors contains: "Nguyen, Giang"


  1. Barr, Jeremy J. (Ed.)
    CRISPR-mediated interference relies on complementarity between a guiding CRISPR RNA (crRNA) and target nucleic acids to provide defense against bacteriophage. Phages escape CRISPR-based immunity mainly through mutations in the protospacer adjacent motif (PAM) and seed regions. However, previous specificity studies of Cas effectors, including the class 2 endonuclease Cas12a, have revealed a high degree of tolerance of single mismatches. The effect of this mismatch tolerance has not been extensively studied in the context of phage defense. Here, we tested defense against lambda phage provided by Cas12a-crRNAs containing preexisting mismatches against the genomic targets in phage DNA. We find that most preexisting crRNA mismatches lead to phage escape, regardless of whether the mismatches ablate Cas12a cleavage in vitro. We used high-throughput sequencing to examine the target regions of phage genomes following CRISPR challenge. Mismatches at all locations in the target accelerated emergence of mutant phage, including mismatches that greatly slowed cleavage in vitro. Unexpectedly, our results reveal that a preexisting mismatch in the PAM-distal region results in selection of mutations in the PAM-distal region of the target. In vitro cleavage and phage competition assays show that dual PAM-distal mismatches are significantly more deleterious than combinations of seed and PAM-distal mismatches, resulting in this selection. However, similar experiments with Cas9 did not result in emergence of PAM-distal mismatches, suggesting that cut-site location and subsequent DNA repair may influence the location of escape mutations within target regions. Expression of multiple mismatched crRNAs prevented new mutations from arising in multiple targeted locations, allowing Cas12a mismatch tolerance to provide stronger and longer-term protection. These results demonstrate that Cas effector mismatch tolerance, existing target mismatches, and cleavage site strongly influence phage evolution.
  2. Influence Maximization (IM), which seeks a small set of important nodes that spread the influence widely into the network, is a fundamental problem in social networks. It finds applications in viral marketing, epidemic control, and assessing cascading failures within complex systems. Despite the huge amount of effort, finding near-optimal solutions for IM is difficult due to its NP-completeness. In this paper, we propose the first social quantum computing approaches for IM, aiming to retrieve near-optimal solutions. We propose a two-phase algorithm that 1) converts IM into a Max-Cover instance and 2) provides efficient quadratic unconstrained binary optimization formulations to solve the Max-Cover instance on quantum annealers. Our experiments on the state-of-the-art D-Wave annealer indicate better solution quality compared to classical simulated annealing, suggesting the potential of applying quantum annealing to find high-quality solutions for IM. 
  3. Significant interest in applying Deep Neural Networks (DNNs) has fueled the need to support engineering of software that uses DNNs. Repairing software that uses DNNs is one such unmistakable software engineering (SE) need where automated tools could be very helpful; however, we do not fully understand the challenges of repairing such software or the patterns that are used when repairing it manually. What challenges should automated repair tools address? What are the repair patterns whose automation could help developers? Which repair patterns should be assigned a higher priority for automation? This work presents a comprehensive study of bug fix patterns to address these questions. We have studied 415 repairs from Stack Overflow and 555 repairs from GitHub for five popular deep learning libraries (Caffe, Keras, TensorFlow, Theano, and Torch) to understand challenges in repairs and bug repair patterns. Our key findings reveal that DNN bug fix patterns are distinctive compared to traditional bug fix patterns; the most common bug fix patterns are fixing data dimension and neural network connectivity; DNN bug fixes have the potential to introduce adversarial vulnerabilities; DNN bug fixes frequently introduce new bugs; and DNN bug localization, reuse of trained models, and coping with frequent releases are major challenges faced by developers when fixing bugs. We also contribute a benchmark of 667 DNN (bug, repair) instances.
  4. Deep learning has gained substantial popularity in recent years. Developers mainly rely on libraries and tools to add deep learning capabilities to their software. What kinds of bugs are frequently found in such software? What are the root causes of such bugs? What impacts do such bugs have? Which stages of the deep learning pipeline are more bug prone? Are there any antipatterns? Understanding such characteristics of bugs in deep learning software has the potential to foster the development of better deep learning platforms, debugging mechanisms, and development practices, and to encourage the development of analysis and verification frameworks. Therefore, we study 2716 high-quality posts from Stack Overflow and 500 bug fix commits from GitHub about five popular deep learning libraries (Caffe, Keras, TensorFlow, Theano, and Torch) to understand the types of bugs, their root causes, their impacts, the bug-prone stages of the deep learning pipeline, and whether there are common antipatterns in this buggy software. The key findings of our study include: data bugs and logic bugs are the most severe bug types in deep learning software, appearing more than 48% of the time; the major root causes of these bugs are Incorrect Model Parameter (IPS) and Structural Inefficiency (SI), showing up more than 43% of the time. We have also found that the bugs in the usage of deep learning libraries have some common antipatterns.
  5. Specialized or secondary metabolites are small molecules of biological origin, often showing potent biological activities with applications in agriculture, engineering and medicine. Usually, the biosynthesis of these natural products is governed by sets of co-regulated and physically clustered genes known as biosynthetic gene clusters (BGCs). To share information about BGCs in a standardized and machine-readable way, the Minimum Information about a Biosynthetic Gene cluster (MIBiG) data standard and repository was initiated in 2015. Since its conception, MIBiG has been regularly updated to expand data coverage and remain up to date with innovations in natural product research. Here, we describe MIBiG version 4.0, an extensive update to the data repository and the underlying data standard. In a massive community annotation effort, 267 contributors performed 8304 edits, creating 557 new entries and modifying 590 existing entries, resulting in a new total of 3059 curated entries in MIBiG. Particular attention was paid to ensuring high data quality, with automated data validation using a newly developed custom submission portal prototype, paired with a novel peer-reviewing model. MIBiG 4.0 also takes steps towards a rolling release model and a broader involvement of the scientific community. MIBiG 4.0 is accessible online at https://mibig.secondarymetabolites.org/.
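
Entry 2 above describes casting a Max-Cover instance as a quadratic unconstrained binary optimization (QUBO) problem. The sketch below illustrates one generic, textbook-style penalty formulation of Max-Cover; it is not the formulation from that paper, which the abstract does not specify. The function names, the penalty weights `A` and `B`, and the toy instance are all assumptions made for illustration, and exhaustive enumeration stands in for a quantum annealer so the example stays self-contained.

```python
from itertools import product


def energy(sets, universe, k, x, y, slack, A, B):
    """Quadratic penalty energy of one binary assignment (lower is better)."""
    total = -sum(y.values())            # reward: one unit per element marked covered
    total += A * (sum(x) - k) ** 2      # penalty unless exactly k sets are chosen
    for e in universe:
        chosen = sum(x[i] for i, s in enumerate(sets) if e in s)
        extra = sum(bit << j for j, bit in enumerate(slack[e]))
        # Zero only when chosen == y[e] + extra, so y[e] = 1 needs a chosen set.
        total += B * (chosen - y[e] - extra) ** 2
    return total


def solve_by_enumeration(sets, universe, k):
    """Minimize the energy by trying every assignment (tiny instances only)."""
    universe = sorted(universe)
    A = len(universe) + 1   # assumed weight: outweighs any gain from breaking the k-set rule
    B = 2                   # assumed weight: outweighs the unit gain of a false "covered" mark
    degree = {e: sum(e in s for s in sets) for e in universe}
    # Slack bits let the coverage constraint hold when several chosen sets share e.
    n_bits = {e: max(degree[e] - 1, 0).bit_length() for e in universe}
    n_x, n_y = len(sets), len(universe)
    best = None
    for bits in product((0, 1), repeat=n_x + n_y + sum(n_bits.values())):
        x = bits[:n_x]
        y = dict(zip(universe, bits[n_x:n_x + n_y]))
        slack, pos = {}, n_x + n_y
        for e in universe:
            slack[e] = bits[pos:pos + n_bits[e]]
            pos += n_bits[e]
        value = energy(sets, universe, k, x, y, slack, A, B)
        if best is None or value < best[0]:
            best = (value, x, y)
    value, x, y = best
    chosen_sets = [i for i, xi in enumerate(x) if xi]
    covered = {e for e, ye in y.items() if ye}
    return chosen_sets, covered, value


# Hypothetical toy instance: pick k = 2 of 4 sets to cover as many elements as possible.
toy_sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
chosen_sets, covered, value = solve_by_enumeration(toy_sets, {1, 2, 3, 4, 5, 6}, k=2)
print(chosen_sets, sorted(covered), value)
```

Because every term is at most quadratic in the binary variables, the same energy could in principle be expanded into a QUBO matrix and handed to an annealer or a classical simulated-annealing sampler; enumeration is used here only because the toy instance has 14 variables.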